PATENT 

Atty. Docket No.: SONY-27300 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



In re Application of: 



Group Art Unit: 2621 



Mikhail Dorojevets et al. 



Examiner: Holder, Anner N 



Serial No.: 10/816,391 



Filed: March 31, 2004 



REPLY BRIEF IN RESPONSE TO 
EXAMINER'S ANSWER 



For: 2D BLOCK PROCESSING 
ARCHITECTURE 



162 N. Wolfe Rd. 
Sunnyvale, CA 94086 
(408) 530-9700 
Customer No. 28960 



Mail Stop Appeal Brief - Patents 
Commissioner for Patents 
P.O. Box 1450 
Alexandria, VA 22313-1450 



Sir: 



In reply to the Examiner's Answer mailed on June 10, 2009, this Reply Brief is hereby 
submitted to the Board of Patent Appeals and Interferences in compliance with the requirements 
of 37 C.F.R. § 41.41. Claims 1-51 (including the independent claims 1, 17, 29, 41) have been 
rejected. 

The burden of establishing a prima facie case of obviousness has not been met to support 
the rejections. 

Appellants contend that the rejection of Claims 1-51 is in error and should be overcome 
by the appeal in the application referenced above. Appellants further contend that the Heaton, 
Taylor, Buchholz and Agarwal patents, applied either separately or together, do not support the 
rejection of Claims 1-51. In view of the foregoing, Appellants respectfully submit this Reply 
Brief, wherein: 

the STATUS OF THE CLAIMS, begins on page 2; 

the GROUNDS FOR REJECTION, begin on page 3; and 

ARGUMENTS, begin on page 4 of this paper. 
STATUS OF THE CLAIMS 

Claims 1-51 are involved in the appeal. 

Claims 1, 2, 7, 8, 11, 13-16, 41, 44, 50 and 51 stand rejected under 35 U.S.C. § 103(a) as 
being unpatentable over A Bit-Serial VLSI Array Processing Chip for Image Processing, IEEE 
Journal of Solid-State Circuits, Vol. 25, No. 2, April 1990 to Heaton et al. (hereinafter "Heaton"). 

Claims 3, 4, 9, 10, 42, 43, 45 and 46 stand rejected under 35 U.S.C. § 103(a) as being 
unpatentable over Heaton in view of U.S. Patent No. 4,992,933 to Taylor (hereinafter "Taylor"). 

Claims 12 and 49 stand rejected under 35 U.S.C. § 103(a) as being unpatentable over Heaton 
in view of U.S. Patent No. 4,745,547 to Buchholz (hereinafter "Buchholz"). 

Claims 5, 6, 15, 47 and 48 stand rejected under 35 U.S.C. § 103(a) as being unpatentable over 
Heaton in view of U.S. Patent No. 5,680,338 to Agarwal et al. (hereinafter "Agarwal"). 

Claims 17-26, 28-38 and 40 stand rejected under 35 U.S.C. § 103(a) as being unpatentable 
over Heaton in view of Taylor and further in view of Agarwal. 

Claims 27 and 39 stand rejected under 35 U.S.C. § 103(a) as being unpatentable over Heaton 
in view of Taylor in view of Agarwal and further in view of Buchholz. 

Within the Appeal Brief, the rejections of Claims 1-51 are appealed. 
GROUNDS OF REJECTION AND MATTERS TO BE REVIEWED ON APPEAL 

The following issues were presented in the Appeal Brief for review by the Board of 
Patent Appeals and Interferences: 

1. Whether Claims 1, 2, 7, 8, 11, 13-16, 41, 44, 50 and 51 are properly rejected 
under 35 U.S.C. § 103(a) as being unpatentable over Heaton. 

2. Whether Claims 3, 4, 9, 10, 42, 43, 45 and 46 are properly rejected under 35 
U.S.C. § 103(a) as being unpatentable over Heaton in view of Taylor. 

3. Whether Claims 12 and 49 are properly rejected under 35 U.S.C. § 103(a) as 
being unpatentable over Heaton in view of Buchholz. 

4. Whether Claims 5, 6, 15, 47 and 48 are properly rejected under 35 U.S.C. § 103(a) 
as being unpatentable over Heaton in view of Agarwal. 

5. Whether Claims 17-26, 28-38 and 40 are properly rejected under 35 U.S.C. § 
103(a) as being unpatentable over Heaton in view of Taylor and further in view of 
Agarwal. 

6. Whether Claims 27 and 39 are properly rejected under 35 U.S.C. § 103(a) as 
being unpatentable over Heaton in view of Taylor in view of Agarwal and further 
in view of Buchholz. 
ARGUMENT 
I. SUMMARY OF THE CLAIMED INVENTION 

The invention disclosed in the present application number 10/816,391 is directed to a 
video platform architecture for video processing that includes complex video 
compression/decompression algorithms in a computer with a two-dimensional Single-Instruction 
Multiple-Data (SIMD) array architecture. The video platform architecture includes one or more 
video processing modules, on-chip shared memory, and a general-purpose RISC central 
processing unit (CPU) used as a system controller. Each video processing module includes a 
rectangular array of processing elements (PEs), a block load/store unit and a global-accumulation 
unit. Video to be processed is configured into blocks of data, and a general-purpose CPU is used 
as a local controller. A plurality of registers are provided in the processing elements and the 
block load/store unit to support two-dimensional processing of the data blocks. Types of 
registers used include block registers, vector registers, scalar registers, and exchange registers. 
Each of these registers is designed to hold a short ordered one- or two-dimensional set of video 
data (data blocks). These registers are arranged in a hierarchical configuration along the data 
flow path between the on-chip memory and processing units within the PE array. 

Each of the claims being appealed includes a limitation specifying a global accumulation 
unit to accumulate the results of the processing operations for each processing element. As will 
be discussed in detail below, the cited references do not disclose, teach, or even suggest a global 
accumulation unit to accumulate the results of the processing operations for each processing 
element. 
II. THE CITED REFERENCES DO NOT DISCLOSE, TEACH, OR EVEN 
SUGGEST EACH AND EVERY ELEMENT OF THE CLAIMS 

Appellants respectfully submit that the cited references, including Heaton, Taylor, 
Agarwal and Buchholz simply do not disclose, teach, or even suggest a global accumulation unit 
to accumulate the results of the processing operations for each processing element. 

Within the Examiner's Answer, in the Response to Arguments section, it has been stated that: 

Heaton discloses in fig 6 a global address which is provided to the local 
processing element. The presence of a global address is an indication of [the] 
processing elements are accumulating on a global scale. Further, Heaton specifies 
that each processing element performs the global operations. [Heaton: page 364 I. 
Introduction line 16; page 367 second column lines 3-4] ... The Examiner would 
like to point out that Heaton discloses a global accumulation for each processing 
element [Heaton: page 364 I. Introduction line 16] meeting the limitations of 
Appellant's claim 1 step 3 [Brief: page 20 lines 15-16] Being that step 3 of 
Appellant claim limitation as written corresponds to a global operation within 
each processing element. [Examiner's Answer, Page 18] 

Appellants respectfully disagree with this response on several points. 

Specifically, the Examiner's statement that, "step 3 of Appellant claim limitation as 
written corresponds to a global operation within each processing element," shows that there is a 
misunderstanding of the claimed invention. The Examiner's statement appears to be stating that 
each processing element performs a global operation. However, the limitation of Claim 1 to 
which the Examiner is referring actually claims: "a global accumulation unit to accumulate the 
results of the processing operations for each processing element." [Claim 1, Present Application] 
Thus, this limitation of Claim 1 clearly claims processing elements which produce results, and a 
global accumulation unit which accumulates the results of the processing elements. Clearly, the 
Examiner's interpretation of the claimed invention: "a global operation within each processing 
element," is incorrect (emphasis added). Since the Examiner's arguments are based on the 
misinterpretation of the claim, the Examiner's rejection of the claimed invention is incorrect, and 
the claims should be allowed. 

Furthermore, the cited section of Heaton [Introduction, line 16] which teaches, "[e]ach 
PE then performs the global operation on its own data," is clearly different from the claimed 
invention, and moreover, teaches away from the claimed invention. The claimed invention, 
specifically Claim 1, claims, "a programmable array of processing elements, each processing 
element including local registers to provide data used in processing operations and to store 
results of the processing operations." Claim 1 also claims as a separate limitation, "a global 
accumulation unit to accumulate the results of the processing operations for each processing 
element." It is clear that there are separate claim elements in the claimed invention: "a 
programmable array of processing elements" and "a global accumulation unit;" whereas, Heaton 
teaches each processing element performs a global operation on its own data which clearly leaves 
out a global accumulation unit. Further, based on the Examiner's arguments, there is no global 
accumulation unit because the global operations are performed within each processing unit. 

Additionally, the mere presence of the global address in Figure 6 does not indicate the 
processing elements are accumulating on a global scale. This is yet another limitation that is not 
taught by Heaton. 

Heaton teaches an array processing chip which integrates many processing elements on a 
single die. Each processing element has several components including a 16-function logical unit, 
an adder, a shift register and local RAM. [Heaton, Abstract] The shift register is used as a local 
accumulator to hold arithmetic operands. [Heaton, page 366, ¶2] Heaton also teaches: 

An OR tree is connected to all PE's on the chip, enabling the values 
presented on each of the 128 PE data buses to be ORed together. This feature 
enables the user to quickly test for a "true" bit in any of the PE's of the array. The 
OR tree is useful in associative operations and in performing data searches. OR 
tree operations are pipelined. The single OR pin output is open drain, enabling 
several BLITZEN chips to be directly wire ORed together. [Heaton, page 367, ¶2] 

The Examiner's Answer also cites Figure 6 of Heaton. Figure 6 of Heaton shows a processing 
element with a final output to a SUM-OR tree. There is no hint, teaching or suggestion that a 
global accumulator is implemented. Further, the Introduction of Heaton is also cited within the 
Examiner's Answer. However, the Introduction of Heaton merely describes parallel processing 
systems in general and provides no hint, teaching or suggestion regarding a global accumulator. 
Further, as described above, the Introduction of Heaton actually teaches away from the claimed 
invention since it teaches that each processing element performs a global operation. Thus, 
Heaton does not teach, hint or suggest a global accumulation unit to accumulate the results of the 
processing operations for each processing element. 

It was asserted within the Examiner's Answer that since Heaton teaches local 
accumulation, it would have been obvious and is fairly suggested by the reference to store results 
for each processing element in one central location (global accumulation unit). The Appellants 
respectfully disagree with this assertion. The Examiner's assertion is based on improper 
hindsight reasoning. Heaton does not suggest, teach or disclose a global accumulation unit as 
claimed. Instead, Heaton teaches a local accumulation unit that is a two operand adder. Heaton 
also teaches an OR tree. However, a global accumulation unit, as suggested by the Examiner, 
would then require an additional adder and a shift register, similar to those used in Heaton's local 
accumulation unit, as described on page 366, ¶1 and Figure 6. Such a global accumulation unit, 
however, would not be able to accumulate the results of the processing operations for each 
processing element, as taught in the present specification; rather, this global accumulation unit 
would only add two results together. Further, Heaton's OR tree is connected to all PE's on the 
chip, enabling the values presented on each of the 128 data buses to be ORed together. Heaton 
teaches that OR tree operations are pipelined. As such, the OR tree performs a logical OR 
operation, not an accumulation operation. Accordingly, the Appellants respectfully submit that 
Heaton does not suggest, teach or disclose a global accumulation unit as claimed. 
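
By way of illustration only, the distinction drawn above between a logical OR over the PE outputs and an accumulation of PE results can be sketched as follows. This sketch is not taken from Heaton or from the present specification; the function names and sample values are hypothetical and are assumed solely for purposes of the example.

```python
from functools import reduce
from operator import or_


def or_tree(pe_bits):
    """Logical OR of one bit from each PE: reports only whether any bit is set."""
    return reduce(or_, pe_bits, 0)


def accumulate(pe_results):
    """Arithmetic accumulation: adds together the result produced by every PE."""
    total = 0
    for result in pe_results:
        total += result
    return total


# Hypothetical sample data (assumed for illustration only).
pe_bits = [0, 1, 0, 1]      # one-bit values presented on four PE data buses
pe_results = [3, 5, 0, 7]   # per-PE processing results

any_true = or_tree(pe_bits)           # 1: at least one PE presented a "true" bit
grand_total = accumulate(pe_results)  # 15: sum over all PE results
```

The OR result carries no information about the magnitude of any PE's result, whereas the accumulated value depends on every PE's result.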

In contrast to Heaton, the presently claimed invention is directed to a video platform 
architecture for video processing which includes complex video compression/decompression 
algorithms in a computer with a two-dimensional Single-Instruction Multiple-Data (SIMD) array 
architecture. The video platform architecture includes one or more video processing modules, 
on-chip shared memory, and a general-purpose RISC central processing unit (CPU) used as a 
system controller. Each video processing module includes a rectangular array of processing 
elements (PEs), a block load/store unit and a global-accumulation unit. Video to be processed is 
configured into blocks of data, and a general-purpose CPU is used as a local controller. A plurality 
of registers are provided in the processing elements and the block load/store unit to support two- 
dimensional processing of the data blocks. Types of registers used include block registers, vector 
registers, scalar registers, and exchange registers. Each of these registers is designed to hold a 
short ordered one- or two-dimensional set of video data (data blocks). These registers are 
arranged in a hierarchical configuration along the data flow path between the on-chip memory 
and processing units within the PE array. [Present Specification, Abstract] 

Furthermore, in some embodiments, the global accumulation unit includes 4 slice 
accumulation (SACC) registers, 1 global PE mask control register, and 1 global accumulation 
(GACC) register. In some embodiments, there is one SACC register for each vertical PE slice of 
the PE array. The SACC registers are the intermediate registers in the operations moving data 
from the LACC register of each PE to the GACC register. In some embodiments, there are four 
40-bit SACC registers in the global accumulation unit. Each of the SACC registers includes three 
individually written sections, namely low 16-bits, middle 16-bits, and high 8-bits. Each PE's 
40-bit LACC is read in steps, specifying which part of the LACC, low 16-bits, middle 16-bits, or 
high 8-bits, is to be placed on the 16-bit bus to the global accumulation unit, and finally into 
the corresponding section of the appropriate SACC register. During operation of the global 
accumulation unit, either the full 40-bit values or packed 20-bit values of the SACC register 
involved in the accumulation operations are added together by a global add instruction and a 
global add and accumulate instruction. The GACC register is used to perform global 
accumulation of LACC values from multiple PEs loaded into the corresponding SACC registers. 
In some embodiments, there is one 48-bit GACC register in the global accumulation unit. 
[Present Specification, page 15, line 24 through page 16, line 8] As described above, Heaton does 
not teach a global accumulation unit to accumulate the results of the processing operations for 
each processing element. 
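
For illustration only, the data movement described above (a 40-bit LACC read in three steps over a 16-bit bus into the sections of an SACC register, followed by accumulation into a 48-bit GACC register) can be sketched as follows. This is a simplified model under stated assumptions, not the implementation of the present specification: all names are hypothetical, values are treated as unsigned, one PE per vertical slice is assumed, and the packed 20-bit mode and the PE mask control register are omitted.

```python
# Section selectors and layout of a 40-bit register (names are hypothetical).
LOW, MID, HIGH = 0, 1, 2
SHIFT = {LOW: 0, MID: 16, HIGH: 32}
WIDTH = {LOW: 16, MID: 16, HIGH: 8}


def read_lacc_section(lacc, section):
    """Value placed on the 16-bit bus for one selected section of a 40-bit LACC."""
    return (lacc >> SHIFT[section]) & ((1 << WIDTH[section]) - 1)


def write_sacc_section(sacc, section, bus_value):
    """Overwrite one individually written section of a 40-bit SACC register."""
    mask = ((1 << WIDTH[section]) - 1) << SHIFT[section]
    return (sacc & ~mask) | ((bus_value << SHIFT[section]) & mask)


def move_lacc_to_sacc(lacc):
    """Transfer an LACC into an SACC in three bus steps: low, middle, high."""
    sacc = 0
    for section in (LOW, MID, HIGH):
        sacc = write_sacc_section(sacc, section, read_lacc_section(lacc, section))
    return sacc


def global_add_accumulate(gacc, saccs):
    """Add the full 40-bit SACC values into a 48-bit GACC (unsigned wrap assumed)."""
    return (gacc + sum(saccs)) & ((1 << 48) - 1)


# Hypothetical LACC values, one PE per vertical slice (assumption).
laccs = [0x12_3456_789A, 1, 2, 3]
saccs = [move_lacc_to_sacc(value) for value in laccs]
gacc = global_add_accumulate(0, saccs)
```

In this model the single GACC value depends on the LACC result of every PE, which is the sense in which the unit accumulates the results of the processing operations for each processing element.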

The burden of establishing a prima facie case of obviousness based on the teachings of 
Heaton, Taylor, Buchholz and Agarwal has not been met by the Examiner because these 
references, either singularly or in combination, do not disclose or make obvious all claim 
limitations in each of Appellants' independent claims. 

III. CONCLUSION 

Each of the claims pending within this appeal includes a limitation specifying a global 
accumulation unit to accumulate the results of the processing operations for each processing 
element. There is nothing in the teachings of the cited references that supports the rejections of 
claims with such limitations. To support the rejection of the pending claims, the Examiner has 
made an attempt to allege obviousness, but this has been done improperly. As discussed in detail 
above, the references and their combination do not disclose, teach, or even suggest the limitations 
of the pending claims. In view of the foregoing, it is respectfully submitted that Claims 1-51 
(including the independent claims 1, 17, 29, 41) are allowable over the teachings of the cited 
references. Therefore, review of this appeal and a favorable indication are respectfully requested. 

Respectfully submitted, 
HAVERSTOCK & OWENS LLP 



Dated: July 27, 2009 By: /Jonathan O. Owens/ 

Jonathan O. Owens 
Reg. No. 37,902 
Attorneys for Appellants 



